Machine Learning

Author: Introduction to Data Mining

Predicting credit default

ZENODO

Data Mining

Authors: Parker Julian; Sloan Terence; Yau Hon
Publication date: 01/01/1998

Edinburgh Research Explorer

First CLADAG data mining prize : data mining for longitudinal data with different marketing campaigns

Authors: Akacha Mouna; Fonseca Thaís C. O.; Liverani Silvia
Publication venue: University of Warwick. Centre for Research in Statistical Methodology
Publication date: 01/01/2009

The CLAssification and Data Analysis Group (CLADAG) of the Italian Statistical Society recently organised a competition, the 'Young Researcher Data Mining Prize', sponsored by the SAS Institute. This paper was the winning entry, and in it we detail our approach to the proposed problem and our results. The main methods used are linear regression, mixture models, Bayesian autoregressive models and Bayesian dynamic models.

Warwick Research Archives Portal Repository

Data-mining chess databases

Authors: Bleicher Eiko; Haworth Guy McCrossan; van der Heijden Harold M J F
Publication venue: The International Computer Games Association
Publication date: 31/12/2010

This is a report on the data-mining of two chess databases, the objective being to compare their sub-7-man content with perfect play as documented in Nalimov endgame tables. Van der Heijden’s ENDGAME STUDY DATABASE IV is a definitive collection of 76,132 studies in which White should have an essentially unique route to the stipulated goal. Chessbase’s BIG DATABASE 2010 holds some 4.5 million games. Insight gained into both database content and data-mining has led to some delightful surprises and created a further agenda.

Central Archive at the University of Reading

Product-Driven Data Mining

Author: Bohun C. Sean
Publication date: 01/01/2003

Manifold Data Mining has developed innovative demographic and household spending pattern databases for six-digit postal codes in Canada. Their collection of information consists of both demographic and expenditure variables, which are expressed through thousands of individually tracked factors. This large collection of information about consumer behaviour is typically referred to as a mine. Although very large in practice, for the purposes of this report the data mine consisted of $m$ individuals and $n$ factors, where $m \sim 2000$ and $n \sim 50$. Ideally, the first algorithm would identify a few factors in the data mine which would differentiate customers in terms of a particular product preference. Then the second algorithm would build on this information by looking for patterns in the data mine which would identify related areas of consumer spending. To test the algorithms, two case studies were undertaken. The first study involved differentiating BMW and Honda car owners. The algorithms developed were reasonably successful at both finding questions that differentiate these two populations and identifying common characteristics amongst the groups of respondents. For the second case study, it was hoped that the same algorithms could differentiate between consumers of two brands of beer. In this case the first algorithm was not as successful at differentiating between all groups; it showed some distinctions between beer drinkers and non-beer drinkers, but these were not as clearly defined as in the first case study. The second algorithm was then used successfully to further identify spending patterns once this distinction was made. In this second case study, a deeper factor analysis could be used to identify a combination of factors which could be used in the first algorithm.
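The abstract does not say how the first algorithm selects its factors, so the following is only a minimal sketch of the general idea, not the report's method. It assumes two groups of individuals described by numeric factors and ranks the factors by the absolute standardized difference of their group means. The function names, the synthetic data and the choice of score are all hypothetical, chosen to match the stated scale of roughly 2000 individuals and 50 factors.

```python
import random
import statistics


def make_synthetic_groups(m=2000, n=50, shifts=None, seed=0):
    """Build two hypothetical groups of individuals with n numeric factors.

    Factors listed in `shifts` have their mean moved in the second group;
    every other factor is pure noise in both groups. This is made-up data,
    not the data mine described in the abstract.
    """
    rng = random.Random(seed)
    if shifts is None:
        shifts = {3: 1.5, 17: 1.0}  # assumed "differentiating" factors
    half = m // 2
    group_a = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(half)]
    group_b = [
        [rng.gauss(shifts.get(j, 0.0), 1.0) for j in range(n)]
        for _ in range(m - half)
    ]
    return group_a, group_b


def rank_factors(group_a, group_b):
    """Return factor indices sorted from most to least differentiating.

    The score is |mean_a - mean_b| divided by a pooled standard deviation,
    a simple stand-in for whatever criterion the report actually used.
    """
    scored = []
    for j in range(len(group_a[0])):
        a = [row[j] for row in group_a]
        b = [row[j] for row in group_b]
        diff = abs(statistics.fmean(a) - statistics.fmean(b))
        pooled_sd = ((statistics.pvariance(a) + statistics.pvariance(b)) / 2) ** 0.5
        if pooled_sd > 0:
            score = diff / pooled_sd
        else:
            # Both groups are constant on this factor.
            score = float("inf") if diff > 0 else 0.0
        scored.append((score, j))
    scored.sort(reverse=True)
    return [j for _, j in scored]


group_a, group_b = make_synthetic_groups()
print("Most differentiating factors:", rank_factors(group_a, group_b)[:5])
```

A per-factor score like this only finds factors that separate the groups on their own. The abstract's closing remark about factor analysis suggests that combinations of factors may be needed when no single factor does so clearly, as in the beer case study.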